What is it? 


Daring to voice new sounds, words, and phrases is an essential part of learning 
to speak a language. However, getting students, particularly in mono-lingual 
classes, to try to speak a foreign language can be a significant challenge. 
Voice interaction assistants, such as Siri, Alexa, or Google Assistant, offer new 
opportunities to create meaningful, fun tasks for language learning that require 
accurate spoken production. Designing good tasks requires an understanding 
of the learning context and needs as well as the interactional opportunities, 
constraints, and risks associated with any particular technology. 


Recent studies suggest that instead of imagining home assistant voice interfaces 
as conversational, designers should think in terms of single turn request and 
response dialogues in which the response often serves as a resource that supports 
some other ongoing activity. Examples include asking ‘Alexa, how do you spell 
awkward?’ while writing an essay by hand, or checking a fact (‘Hey Google, 
what’s the population of London?’) while arguing with another human. People 
mainly use voice interaction to get things done quickly and easily, to support 
other activities, and for social fun. We explore pedagogic opportunities created 
by this kind of interaction; speaking to machines rather than speaking with 
machines (see Satar, this volume). 


1. British Council, Bilbao, Spain; josh.underwood@gmail.com; https://orcid.org/0000-0002-1486-0429 
How to cite: Underwood, J. (2021). Speaking to machines: motivating speaking through oral interaction with intelligent 
assistants. In T. Beaven & F. Rosell-Aguilar (Eds), Innovative language pedagogy report (pp. 127-132). 
Research-publishing.net. https://doi.org/10.14705/rpnet.2021.50.1247 


© 2021 Joshua Underwood (CC BY) 


Chapter 20. Speaking to machines 


Example 


Teachers quickly saw opportunities to use Intelligent Assistants (IAs) as 
classroom assistants and to motivate speaking. Examples include: setting timers 
and playing background music for activities; as a resource to support daily 
routines — e.g. finding out about today’s weather in a different part of the world; 
and asking for spellings, definitions, synonyms, or checking facts to support 
individual or group work. 


Benefits 


Students, particularly young learners, often find this kind of interaction 
motivating and want to try their hand at getting a machine to do something using 
their voice. What is more, IAs can potentially answer factual questions teachers 
may not know the answers to, thus supporting students’ curiosity. They can also 
act as resources to support group work and student-to-student conversations 
— imagine a device per group, thus potentially freeing up teachers to monitor, 
listen, and help more. 


One way that language teachers can exploit these opportunities for language 
development is by designing tasks that push students to produce vocabulary or 
language structures we want them to start to acquire. This might be in the form 
of written worksheets designed to scaffold groups doing IA-assisted research, 
e.g. find out about and compare two countries with prompts such as population, 
climate, typical foods, capital city, etc. Students need to help each other 
formulate and produce accurate enough questions to get the information they 
need. Failures can prompt students to reflect on the accuracy of their own and 
others’ speech, self and peer-correct, try again, and ask teachers and one another 
for help. Such tasks give students a reason to produce and hear one another 
speaking the target language and may lead to them speaking it with one another. 
Additional benefits of these kinds of activities are that efficient interaction with 
IAs requires students to listen carefully, as there is no visual feedback, and to 
think about and respect turn-taking, useful skills to work on in any language 
learning classroom. 


Devices with screens and voice interaction offer different opportunities. For 
example, students might ask ‘show me a picture of an artichoke’ to support 
understanding while reading or listening. This not only helps students make 
multimodal associations but also gives feedback on their pronunciation as the 
device displays what it ‘thinks’ they said as text. This information can prompt 
learners to notice errors, self-correct, and/or ask for help. Teachers can also 
design tasks to help students notice typical sound difficulties, e.g. ‘show me a 
picture of a ship/sheep, cup/cap, lorry/lolly’. 


Voice interaction can also be associated with physical changes, such as turning 
the lights off, thus creating multimodal and memorable associations. A long 
history of robot-assisted language learning suggests young learners may find 
speaking a foreign language to a robot much less intimidating than speaking 
to a human teacher or peer. Also, robots can move in response to speech and, 
particularly those that can recognise attention and simulate emotions, may 
encourage learners to make emotional associations with the language they use, 
with possible benefits for meaningfulness and memory, though this also raises 
ethical issues and opportunities to engage with them. 


Potential issues 


Many of the technologies mentioned are designed for individual use in a first 
language rather than for groups of language learners in educational settings. 
There are consequent issues and opportunities to resolve these. 


Firstly, education systems need devices that comply with data protection 
regulations. Teachers also need materials that help them and their students 
discuss the risks and opportunities of voice interaction and agree on appropriate 
uses. This can lead to useful explorations of what any particular IA is capable 
of: Where are the opportunities and limitations? What questions can we ask to 
test our ideas? Learning to live with Artificial Intelligence (AI) and speak to 
machines seems likely to be an essential skill for the future. 


Secondly, language learners and teachers need tailor-made Voice User Interfaces 
(VUIs) that not only support engaging tasks but also cope well with accents, 
typical classroom interactions (What does... mean?, Could you explain...?, 
Could you say that again?, etc.), and multilingual input (e.g. How do you say... 
in...?), respond using language appropriate to a learner’s competence level, and 
capture data useful for feedback on language. Teachers also need to reassess their 
roles in such an environment: how best can I use my time in this environment? 
What do I need to do to help students make the best use of these devices? For 
example, trying and failing to communicate with an IA can quickly become 
frustrating. Teachers need to monitor, help overcome difficulties, and keep the 
atmosphere one of playful experimentation with new language. 


Thirdly, with ethical automated data capture, opportunities to support teachers 
in assessment and in providing helpful feedback on speaking activities open 
up. Reversing the human request followed by device response model, one can 
imagine pairs of students in a class engaging in speaking tasks in response to 
device requests, e.g. I’d like you to speak for two minutes about.... It’s very 
hard for a human teacher to monitor many simultaneous conversations in a 
classroom and students may well not feel they are being listened to and go off 
task. This situation might be improved by devices capturing what students say 
and providing transcripts, potentially with automated highlighting of possible 
errors and suggestions about opportunities to improve vocabulary range. Such 
information might be used by teachers and/or students to notice opportunities for 
improvements and provide motivating and helpful feedback. 


To support this kind of human-machine collaboration, teachers and learners 
need to be involved in understanding the opportunities and risks, agreeing on 
acceptable uses, and designing desirable ways roles might be shared with ‘cobot 
teachers’. This kind of conversation in turn can lead to a useful reassessment of 
what makes humans and human communication different to machines and what 
makes human teachers special. 


Looking to the future 


Here we have focused on speaking to machines, rather than with 
machines, and on motivating speaking amongst groups of learners 
in classroom settings. This is about creating an atmosphere that 
encourages speaking in the target language and fosters human-to- 
human activity and conversation. Here technology does not replace 
teachers but rather acts as a helpful resource. 


This opportunity is distinct from the more conversational and 
individualised uses of voice interaction in environments like Alelo’s 
Enskill. Opportunities for conversation and freer-speaking practice 
with AIs are undoubtedly coming (see Google’s recent Meena 
chatbot experiments), though interactions with these may also be 
exploited for classroom and group learning and help us to focus 
on and identify what is special and different about speaking with a 
human. 